---
title: "Age groups"
author: "Jens von Bergmann"
date: "2017-08-10"
output: html_notebook
vignette: >
  %\VignetteIndexEntry{Age groups}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

The [`dotdensity` R package](https://github.com/mountainMath/dotdensity) provides dot-density functions that can be used with any kind of data, but it was designed with hierarchical
geographic data, like census data, in mind.

## Age groups
For this example we are interested in the geographic distribution of age groups.
We will make use of the [cancensus](https://github.com/mountainMath/cancensus) package to obtain census
data on age groups for the City of Vancouver and the Point Grey Peninsula (the area to the west of the city).

```{r}
#devtools::install_github('mountainmath/dotdensity')
library(dotdensity)
#devtools::install_github('mountainmath/cancensus')
library(cancensus)
library(dplyr) # provides %>%, mutate() and tibble(), which prep_data below relies on
# options(cancensus.api_key='<your API key>')
```



Using the [CensusMapper API tool](https://censusmapper.ca/api/CA16) we select the region and
variables we need.
```{r, message=FALSE, warning=FALSE}
dataset='CA16'
regions=list(CT=c("9330069.01","9330069.02"),CSD=c("5915022","5915803")) #list(CMA="59933")
vectors=c("v_CA16_4","v_CA16_64","v_CA16_82","v_CA16_100","v_CA16_118","v_CA16_136","v_CA16_154","v_CA16_172","v_CA16_190","v_CA16_208","v_CA16_226","v_CA16_244")
census_data <- cancensus.load(dataset='CA16', regions=regions, vectors=vectors, level='CT')

```

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.height=4, fig.width=4}
theme_opts<-list(ggplot2::theme(panel.grid.minor = ggplot2::element_blank(),
                       panel.grid.major = ggplot2::element_blank(),
                       panel.background = ggplot2::element_rect(fill = 'light blue', colour = NA),
                       plot.background = ggplot2::element_rect(fill="light grey",
                       size=1,linetype="solid",color="black"),
                       axis.line = ggplot2::element_blank(),
                       axis.text.x = ggplot2::element_blank(),
                       axis.text.y = ggplot2::element_blank(),
                       axis.ticks = ggplot2::element_blank(),
                       axis.title.x = ggplot2::element_blank(),
                       axis.title.y = ggplot2::element_blank(),
                       plot.title = ggplot2::element_text(size=22)))

base_geom <- cancensus.load(geo_format='sp',dataset=dataset, regions=regions, level="Regions")

basemap <-   ggplot2::ggplot(base_geom) +
    ggplot2::geom_polygon(ggplot2::aes(long, lat, group = group), fill = "white", size=0.1) +
    #ggplot2::geom_polygon(ggplot2::aes(long, lat, group = group), colour = "#222222", fill = "white", size=0.1) +
    ggplot2::guides(colour = ggplot2::guide_legend(override.aes = list(size=2))) +
    ggplot2::labs(color = "label",caption="Source: StatCan Census 2016 via cancensus & CensusMapper.ca") +
    ggplot2::coord_map(projection="lambert", lat0=49, lat=49.4) +
    theme_opts
```


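We have defined a convenience function `prep_data` that replaces missing values with zero, sums the census vectors into five broader age groups, and attaches the group labels to the data.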
```{r, message=TRUE, warning=TRUE, include=FALSE}
# rename columns for better readability and compute aggregates
prep_data <- function(geo){
  data <- geo@data %>% replace(is.na(.), 0) %>%
    mutate(
      !!"0-19" := v_CA16_4 + v_CA16_64,
      !!"20-34" := v_CA16_82 + v_CA16_100 + v_CA16_118,
      !!"35-49" := v_CA16_136 + v_CA16_154 + v_CA16_172,
      !!"50-64" := v_CA16_190 + v_CA16_208 + v_CA16_226,
      !!"65+" := v_CA16_244
           )
  
  ls=c("0-19","20-34","35-49","50-64","65+")
  labels <- tibble(Vector=ls,Detail=ls)
  

  geo@data <- data
  attributes(geo)$dot_labels <- labels
  return(geo)
}
```
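As a rough illustration of the kind of aggregation `prep_data` performs, the sketch below collapses fine age bands into broader groups using base R only. The data frame `toy_areas`, its column names, and all counts are made up for this example and do not come from the census.

```{r}
# Toy data: three hypothetical areas with made-up counts for four 5-year bands.
toy_areas <- data.frame(
  GeoUID    = c("A", "B", "C"),
  age_0_14  = c(120, 80, NA),   # NA mimics a suppressed value
  age_15_19 = c(40, 30, 10),
  age_20_24 = c(60, 55, 20),
  age_25_29 = c(70, 65, 25)
)

# Treat missing counts as zero, as prep_data does.
toy_areas[is.na(toy_areas)] <- 0

# Collapse the fine bands into broader groups.
toy_areas[["0-19"]]  <- toy_areas$age_0_14 + toy_areas$age_15_19
toy_areas[["20-29"]] <- toy_areas$age_20_24 + toy_areas$age_25_29

toy_areas[, c("GeoUID", "0-19", "20-29")]
```

Replacing `NA` with zero keeps the sums defined, at the cost of undercounting wherever a value was suppressed.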
Armed with that we load in the geographic data for the dissemination blocks and the census data for the dissemination areas. As a precaution we also load the census tract level data in case we are missing dissemination block level data due to privacy or quality concerns, which was not the case here.
```{r, message=FALSE, warning=FALSE}

data_ct <- cancensus.load(geo_format='sp',labels='short',dataset=dataset, regions=regions, vectors=vectors,level="CT") %>% prep_data
data_da <- cancensus.load(geo_format='sp',labels='short',dataset=dataset, regions=regions, vectors=vectors,level="DA") %>% prep_data
data_db <- cancensus.load(geo_format='sp',labels='short',dataset=dataset, regions=regions, vectors=vectors,level="DB") 
```
## Mapping

The categories we want to map consist of all the age groups computed from the loaded census variables. We pick colours to represent these, decide on a scale (how many people each dot represents), and set the opacity and size of each dot.
```{r}
# Set the categories we want to map: the age group labels attached by prep_data
categories=attributes(data_ct)$dot_labels$Detail
colors=c("#0000ff", "#ff0000", "#ffff00", "#00ff00", "#00ffff")
scale=10
alpha=0.75
size=0.5

```
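To make the role of the scale concrete, here is a small sketch of how a scale of 10 people per dot could translate counts into dots for a single area. The counts are hypothetical, and the simple integer division is an assumption made for illustration; how `dotdensity` itself treats the leftover people is not shown in this vignette.

```{r}
# Hypothetical counts for one area; not real census values.
toy_counts <- c("0-19" = 143, "20-34" = 312, "35-49" = 256, "50-64" = 198, "65+" = 87)
toy_scale <- 10  # people per dot, same value as `scale` above

# Whole dots per category, plus the leftover people that do not fill a dot.
toy_dots <- toy_counts %/% toy_scale
toy_remainder <- toy_counts %% toy_scale

toy_dots
toy_remainder
```

A larger scale draws fewer dots and leaves more people in the remainders, so small categories in small areas can disappear from the map.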
```{r, echo=FALSE, message=FALSE, warning=FALSE}
# set map title using the scale and colour values
title=paste0("People per Age Group\n1 dot = ",scale," people")
basemap <- basemap + ggplot2::scale_colour_manual(title,values = colors) 
```


All that's left to do is re-aggregate the data, compute the dot locations, and map them.
```{r, fig.width=12}
data_da@data <- dot_density.proportional_re_aggregate(data=data_da@data,parent_data=data_ct@data,geo_match=setNames("GeoUID","CT_UID"),categories=categories,base="Population")
data_db@data <- dot_density.proportional_re_aggregate(data=data_db@data,parent_data=data_da@data,geo_match=setNames("GeoUID","DA_UID"),categories=categories,base="Population")

dots.db <- dot_density.compute_dots(geo_data = data_db, categories = categories, scale=scale)
basemap + dot_density.dots_map(dots=dots.db,alpha=alpha,size=size)
```
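The idea behind proportional re-aggregation can be sketched in base R: category counts known for a parent area are distributed among its child areas in proportion to each child's share of a base variable, here the population. This is a simplified illustration under that assumption, not the implementation of `dot_density.proportional_re_aggregate`, and all names and numbers below are hypothetical.

```{r}
# Hypothetical parent area (think census tract) with known category counts.
toy_parent <- data.frame(parent_id = "P1", Population = 1000,
                         young = 300, old = 700)

# Hypothetical child areas (think dissemination areas) with only a population.
toy_children <- data.frame(child_id = c("C1", "C2", "C3"),
                           parent_id = "P1",
                           Population = c(500, 300, 200))

# Attach the parent's category counts to each child.
toy_merged <- merge(toy_children, toy_parent,
                    by = "parent_id", suffixes = c("", "_parent"))

# Each child's share of the population across its siblings.
toy_share <- toy_merged$Population /
  ave(toy_merged$Population, toy_merged$parent_id, FUN = sum)

# Distribute the parent's counts by that share.
for (toy_cat in c("young", "old")) {
  toy_merged[[toy_cat]] <- toy_merged[[toy_cat]] * toy_share
}

toy_merged[, c("child_id", "Population", "young", "old")]
```

Because the shares within a parent sum to one, the distributed counts add back up to the parent's totals, which is what lets the estimate be pushed down one level at a time.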
## Takeaway
By changing a couple of lines of code in the [previous example about languages spoken at home in Vancouver](https://github.com/mountainMath/dotdensity/blob/master/vignettes/languages-example.Rmd) and taking out explanatory steps we could easily build a dot-density map of age groups. Using `cancensus`, pulling in the relevant data was a breeze, and the `dotdensity` package did all the relevant dot-density calculations for us.
